Structures of Knowledge from Wikipedia Networks
Abstract
Knowledge is useless without structure. While the classification of knowledge has been an enduring philosophical enterprise, it recently found applications in computer science, notably for artificial intelligence. The availability of large databases allowed for complex ontologies to be built automatically, for example by extracting structured content from Wikipedia. However, this approach is subject to manual categorization decisions made by online editors. Here we show that an implicit classification hierarchy emerges spontaneously on Wikipedia. We study the network of first links between articles, and find that it centers on a core cycle involving concepts of fundamental classifying importance. We argue that this structure is rooted in cultural history. For European languages, articles like Philosophy and Science are central, whereas Human and Earth dominate for East Asian languages. This reflects the differences between ancient Greek thought and Chinese tradition. Our results reveal the powerful influence of culture on the intrinsic architecture of complex data sets.
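The idea of a first-link network that converges on a core cycle can be illustrated with a minimal sketch. It assumes each article has a single outgoing first link, stored in a plain dictionary. The function name `find_core_cycle` and the toy mapping `TOY_FIRST_LINKS` are hypothetical and are not taken from the paper or from real Wikipedia data.

```python
def find_core_cycle(first_link, start):
    """Follow first links from `start` until an article repeats.

    `first_link` maps an article title to the title of the first article
    it links to. Returns the list of articles forming the cycle that the
    walk ends in, or an empty list if the chain stops at an article with
    no recorded first link.
    """
    position = {}  # article -> index in `path` at first visit
    path = []
    current = start
    while current in first_link and current not in position:
        position[current] = len(path)
        path.append(current)
        current = first_link[current]
    if current in position:
        # The walk returned to an article already seen: the cycle is
        # everything from that first visit onward.
        return path[position[current]:]
    return []


# Hypothetical toy data for illustration only (not real first links).
TOY_FIRST_LINKS = {
    "Cat": "Biology",
    "Biology": "Science",
    "Science": "Knowledge",
    "Knowledge": "Philosophy",
    "Philosophy": "Science",
}

if __name__ == "__main__":
    print(find_core_cycle(TOY_FIRST_LINKS, "Cat"))
```

Because every article has at most one first link, a walk from any starting article either dead-ends or falls into exactly one cycle, which is why a single pass with a visited-position map is enough here.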
Similar resources
Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles
When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...
Participation and Scientific Collaboration in Persian Wikipedia
Background and Aim: This research studies effective participation and scientific collaboration in Persian Wikipedia from 2003 to 2012. Method: The library method was used. Given the objectives and the nature of the subject, the research method is descriptive-applied, and scientometric techniques were used in its implementation. Excel and SPSS software were used for...
Wikipedia as Domain Knowledge Networks - Domain Extraction and Statistical Measurement
This paper investigates knowledge networks of specific domains extracted from Wikipedia and performs statistical measurements on selected domains. In particular, we first present an efficient method to extract a specific domain knowledge network from Wikipedia. We then extract four domain networks on, respectively, mathematics, physics, biology, and chemistry. We compare the mathematics domain ...
The influence of network structures of Wikipedia discussion pages on the efficiency of WikiProjects
As a platform for discussion and communication, talk pages play an essential role in Wikipedia to facilitate coordination, sharing of information and knowledge resources among Wikipedians. In this work we explore the influence of network structures of these pages on the efficiency of WikiProjects. Project efficiency is measured as the amount of work done by project members in a quarter. The stu...
Unsupervised Knowledge Extraction for Taxonomies of Concepts from Wikipedia
A novel method for unsupervised acquisition of knowledge for taxonomies of concepts from raw Wikipedia text is presented. We assume that the concepts classified under the same node in a taxonomy are described in a comparable way in Wikipedia. The concepts in 6 taxonomies extracted from WordNet are mapped onto Wikipedia pages and the lexico-syntactic patterns describing semantic structures expre...
Journal title:
- CoRR
Volume abs/1708.05368
Publication date 2017